Zhaoyang Wang

Can RL Teach Long-Horizon Reasoning to LLMs? Expressiveness Is Key

May 07, 2026

WebXSkill: Skill Learning for Autonomous Web Agents

Apr 14, 2026

Mitigating Entangled Steering in Large Vision-Language Models for Hallucination Reduction

Apr 09, 2026

Dual-Loop Control in DCVerse: Advancing Reliable Deployment of AI in Data Centers via Digital Twins

Apr 08, 2026

Provable and Practical In-Context Policy Optimization for Self-Improvement

Mar 02, 2026

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

Feb 25, 2026

Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

Feb 11, 2026

Reliable and Responsible Foundation Models: A Comprehensive Survey

Feb 04, 2026

Thinker: Training LLMs in Hierarchical Thinking for Deep Search via Multi-Turn Interaction

Nov 14, 2025

Adapting Web Agents with Synthetic Supervision

Nov 08, 2025